Fast Parallel String Prefix-Matching


Abstract 

An $O(\log\log m)$ time $\frac{n\log m}{\log\log m}$-processor CRCW-PRAM algorithm for the
string prefix-matching problem over a general alphabet is presented. The algorithm can
also be used to compute the KMP failure function in $O(\log\log m)$ time on
$\frac{m\log m}{\log\log m}$ processors. These results improve on the running time of the
best previous algorithm for both problems, which was $O(\log m)$, while preserving the same
number of operations.

1 Introduction


String matching is the problem of finding all occurrences of a short pattern string
$\mathcal{P}[1..m]$ in a longer text string $\mathcal{T}[1..n]$. The classical sequential
algorithm of Knuth, Morris and Pratt [12] solves the string matching problem in time that
is linear in the length of the input strings. The Knuth-Morris-Pratt [12] string matching
algorithm can be easily generalised to find the longest pattern prefix that starts at each
text position within the same time bound. We refer to this problem as string
prefix-matching.

In parallel, the string matching problem can be solved in $O(\log\log m)$ time on an
$\frac{n}{\log\log m}$-processor CRCW-PRAM, as shown by Breslauer and Galil [7]. However,
the best parallel algorithms for the string prefix-matching problem and for computing the
KMP failure function were simple derivations of Galil's [11] $O(\log m)$ time $n$-processor
string matching algorithm. (The KMP failure function is a table that is computed in the
pattern processing step of the Knuth-Morris-Pratt string matching algorithm and is used to
guide that algorithm when comparisons fail.) These bounds are over a general alphabet,
where the only access an algorithm has to the input strings is by pairwise symbol
comparisons. In fact, Galil's [11] algorithm can be implemented using only
$\frac{n}{\log m}$ processors if the size of the alphabet is a constant.

This paper presents a new algorithm for the string prefix-matching problem over a general
alphabet. The algorithm takes $O(\log\log m)$ time on an $\frac{n\log m}{\log\log m}$-processor
CRCW-PRAM. It is also shown that this algorithm can be used to compute the KMP failure
function of a string $\mathcal{P}[1..m]$ in $O(\log\log m)$ time on
$\frac{m\log m}{\log\log m}$ processors.

A parallel algorithm is said to achieve an optimal speedup if its time-processor product is
the same as the running time of the fastest sequential algorithm. The new algorithms that
are presented in this paper are still a factor of $\log m$ processors away from optimality,
but they have the same time-processor product as the best previous parallel algorithms [11]
for the two problems. Both algorithms are the fastest possible with the number of
processors used, as implied by a lower bound that was given by Breslauer and Galil [8] for
the string matching problem. Note that both problems can be solved even in a constant time
if more processors are available.

The string prefix-matching algorithm follows techniques that were used in solving several
other parallel string problems [1, 2, 5, 6, 9]. In particular, it uses the parallel string
matching algorithm of Breslauer and Galil [7] as a procedure that solves several string
matching problems simultaneously and then combines the results of the string matching
problems into an answer to the string prefix-matching problem.

The  paper  is  organized  as  follows.  Section  2  overviews  some  parallel  algorithms  and  tools 
that  are  used  in  the  new  algorithms.  Section  3  describes  the  prefix-matching  algorithm  and 
Section  4  shows  how  to  use  that  algorithm  to  compute  the  KMP  failure  function. 

2  The  CRCW-PRAM  Model 

The algorithms described in this paper are for the concurrent-read concurrent-write parallel
random  access  machine  model.  We  use  the  weakest  version  of  this  model  called  the  common 
CRCW-PRAM.  In  this  model  many  processors  have  access  to  a  shared  memory.  Concurrent 
read  and  write  operations  are  allowed  at  all  memory  locations.  If  several  processors  attempt 
to  write  simultaneously  to  the  same  memory  location,  it  is  assumed  they  always  write  the 
same  value. 

The prefix-matching algorithm uses a string matching algorithm as a procedure to find all
occurrences of a given pattern in a given text. The input to the string matching algorithm
consists of two strings, $pattern[1..m]$ and $text[1..n]$, and the output is a Boolean array
$match[1..n]$ that has a "true" value at each position where an occurrence of the pattern
starts in the text. We use Breslauer and Galil's [7] parallel string matching algorithm,
which takes $O(\log\log m)$ time on an $\frac{n}{\log\log m}$-processor CRCW-PRAM. This
algorithm is the fastest optimal parallel string matching algorithm possible over a general
alphabet, as shown by Breslauer and Galil [8].

We also use an algorithm of Fich, Ragde and Wigderson [10] to compute the minima of
$n$ integers from the range $1, \ldots, n$ in a constant time using an $n$-processor
CRCW-PRAM. We use this algorithm, for example, to find the first occurrence of a string in
another string. After all the occurrences are computed by the string matching algorithm
mentioned above, the minima algorithm is used to find the smallest $i$ such that
$match[i]$ = "true".

For the computation of the KMP failure function we use an algorithm that computes the
prefix maxima of a sequence. Berkman, Schieber and Vishkin [3] noticed that the parallel
maxima algorithm of Shiloach and Vishkin [14] can be modified to find the maxima of each
prefix of an $n$-element sequence in $O(\log\log n)$ time on an
$\frac{n}{\log\log n}$-processor CRCW-PRAM.

One of the major issues in the design of PRAM algorithms is the assignment of processors
to their tasks. We ignore this issue in this paper and use a general theorem that states
that the assignment can be done.

Theorem 2.1 (Brent [4]) Any synchronous parallel algorithm of time $t$ that consists of a
total of $x$ elementary operations can be implemented on $p$ processors in
$\lceil x/p \rceil + t$ time.

This theorem can be used, for example, to slow down a constant time $p$-processor algorithm
to work in time $t$ using $p/t$ processors. Coming back to the example above, which finds
the first occurrence of one string in another, one sees that the second step of finding the
smallest index of an occurrence takes a constant time on $n$ processors, while the call to
the string matching procedure takes $O(\log\log m)$ time on $\frac{n}{\log\log m}$
processors. By Theorem 2.1 the second step can be slowed down to work in $O(\log\log m)$
time on $\frac{n}{\log\log m}$ processors.
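
As a quick arithmetic check of Theorem 2.1, the sketch below evaluates the bound
$\lceil x/p \rceil + t$ for the minima step discussed above. The function name
`brent_time` and the sample sizes are illustrative assumptions and do not come from the
original presentation.

```python
import math

def brent_time(x: int, p: int, t: int) -> int:
    """Upper bound ceil(x / p) + t on the running time, as in Theorem 2.1."""
    return -(-x // p) + t  # integer ceiling division

# Hypothetical sizes: a text of n = 2**16 symbols and a pattern of m = 2**16 symbols.
n = 2 ** 16
m = 2 ** 16
loglog_m = math.ceil(math.log2(math.log2(m)))  # log log m = 4 for this m

# The minima step performs about n operations in t = 1 round on n processors.
# Scheduled on n / log log m processors, it takes about log log m + 1 rounds.
p = n // loglog_m
slowed = brent_time(n, p, 1)
```

With these sample values the slowed-down step needs `loglog_m + 1` rounds, which is still
$O(\log\log m)$.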

As mentioned in the introduction, the string prefix-matching problem can be solved faster
if more processors are available.

Theorem 2.2 The string prefix-matching problem takes a constant time on an $nm$-processor
CRCW-PRAM.

Proof: The following trivial string prefix-matching algorithm takes a constant time.

•  Assign  m  processors  to  each  text  position  to  find  the  length  of  the  longest  pattern 
prefix  that  starts  at  that  position.  Each  of  the  m  processors  simultaneously  compares 
the  symbols  of  the  pattern  with  the  corresponding  symbols  of  the  text. 

• Find the position of the first comparison that failed in each group of $m$ comparisons
that were assigned to a specific text position. The successful comparisons up to the first
comparison that failed correspond to the longest pattern prefix that occurs starting at
this text position.

This  step  takes  a  constant  time  on  m  processors  using  the  Fich,  Ragde  and  Wigderson 
[10]  integer  minima  algorithm. 

Since  there  are  m  processors  assigned  to  each  of  the  n  text  positions  the  total  number  of 
processors  used  is  nm.  □ 
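
The two steps of this proof can be simulated sequentially. The following is a minimal
sketch, not a parallel implementation: the inner list stands in for the $m$ processors
assigned to one text position, and Python's `min` stands in for the constant-time integer
minima algorithm. The function name `prefix_match_bruteforce` and the 0-indexed output are
assumptions made for illustration.

```python
def prefix_match_bruteforce(text: str, pattern: str) -> list[int]:
    """phi[i] = length of the longest pattern prefix starting at text[i] (0-indexed)."""
    n, m = len(text), len(pattern)
    phi = []
    for i in range(n):  # one group of m "processors" per text position
        # Each "processor" j performs one comparison; out-of-bounds comparisons fail.
        failed = [j for j in range(m) if i + j >= n or text[i + j] != pattern[j]]
        # The first failed comparison gives the prefix length (m if none failed).
        phi.append(min(failed, default=m))
    return phi
```

For example, `prefix_match_bruteforce("abaabab", "abab")` returns `[3, 0, 1, 4, 0, 2, 0]`;
the value 4 marks a complete occurrence of the pattern.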

3  The  Prefix-Matching  Algorithm 

We describe an algorithm that, given the text string $\mathcal{T}[1..n]$ and the pattern
string $\mathcal{P}[1..m]$, will compute the longest pattern prefix that occurs starting at
each text position. The output will be an array $\Phi[1..n]$ such that
$\mathcal{T}[i..i+\Phi[i]-1] = \mathcal{P}[1..\Phi[i]]$ and, if $\Phi[i] < m$, then
$\mathcal{T}[i+\Phi[i]] \ne \mathcal{P}[\Phi[i]+1]$. Using this notation, if $\Phi[i] = m$,
then a complete occurrence of the pattern starts at text position $i$.

Theorem 3.1 There exists an algorithm that, given the input strings $\mathcal{T}[1..n]$ and
$\mathcal{P}[1..m]$, will compute the longest pattern prefix that starts at each text
position in $O(\log\log m)$ time on $\frac{n\log m}{\log\log m}$ processors.

Proof: To simplify the presentation assume without loss of generality that the algorithm
can access indices of the input strings which are out of the string boundaries and that all
comparisons to these symbols fail. All entries of the output array $\Phi[1..n]$ are
initialized to be zero.

The algorithm will proceed in independent stages which are computed simultaneously.
In stage number $\eta$, $0 \le \eta \le \lfloor \log m \rfloor$, the algorithm computes all
entries $\Phi[i]$ of the output array such that $2^\eta \le \Phi[i] < 2^{\eta+1}$. Note that
the stages compute disjoint ranges of the output array values and that all possible values
are covered.

We denote by $T_\eta$ the time it takes to compute stage number $\eta$ on $P_\eta$
processors. The number of operations in stage $\eta$ is $O_\eta = T_\eta P_\eta$. In the
next section it is shown that each stage $\eta$ can be computed in
$T_\eta = O(\log\log 2^\eta)$ time and $O_\eta = O(n)$ operations using Breslauer and
Galil's [7] parallel string matching algorithm.

Since the stages of the algorithm are computed simultaneously, the total number of
operations performed in all stages is $\sum_\eta O_\eta = O(n\log m)$ and the time is
$\max_\eta T_\eta = O(\log\log m)$. By Theorem 2.1 the algorithm can be implemented in
$O(\log\log m)$ time on $\frac{n\log m}{\log\log m}$ processors. □
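
The partition of the output values among the stages can be checked with a few lines of
code. This is only an illustrative sketch; the helper names `stage_of` and `stage_ranges`
are assumptions introduced here.

```python
def stage_of(length: int) -> int:
    """Stage eta with 2**eta <= length < 2**(eta + 1), for length >= 1."""
    return length.bit_length() - 1

def stage_ranges(m: int) -> list[range]:
    """Output values handled by stages 0..floor(log2 m), clipped to the pattern length m."""
    return [range(2 ** eta, min(2 ** (eta + 1), m + 1))
            for eta in range(m.bit_length())]
```

For a pattern of length 10 there are four stages, handling the values 1, 2..3, 4..7 and
8..10. The value 0 needs no stage because the output array is initialized to zero.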


3.1  A  Single  Stage 

This section describes a single stage $\eta$, $0 \le \eta \le \lfloor \log m \rfloor$, that
computes all values of the output array $\Phi$ that are in the range
$2^\eta, \ldots, 2^{\eta+1}-1$, in $O(\log\log 2^\eta)$ time and $O(n)$ operations.

Stage number $\eta$ starts with a call to a string matching algorithm to find all
occurrences of the pattern prefix $\mathcal{P}[1..2^\eta]$ in the text. Note that a pattern
prefix which is long enough to be in the range that has to be computed by this stage can
only start at these occurrences. In the rest of this section we show how to find
efficiently the maximal length of the pattern prefixes that start at each of these
occurrences, or to verify that the prefixes are long enough to be computed by another
stage.

If an occurrence is found starting at some text position $q$, then the algorithm knows that
a pattern prefix whose length is at least $2^\eta$ starts at that text position. Similarly
to Theorem 2.2, using only $2^\eta$ processors, the algorithm can find in a constant time
the length of the pattern prefix that starts at text position $q$, or it can conclude that
the prefix is at least of length $2^{\eta+1}$ and therefore out of the range that has to be
computed by this stage.

This last step is very efficient. However, since there can be many occurrences of
$\mathcal{P}[1..2^\eta]$ in the text, repeating this step for all these occurrences can be
too costly. We restrict our attention to a small part of the text string and solve the
problem simultaneously in each part. This allows us to use some periodicity properties of
strings which are described below.

We partition the text string $\mathcal{T}[1..n]$ into consecutive blocks of length
$\lfloor 2^{\eta-1} \rfloor + 1$ each. For the rest of this section we restrict our
attention to a single block. Let $q_i$, $i = 1..r$, be the indices of all occurrences of
the pattern prefix $\mathcal{P}[1..2^\eta]$ that start at text positions in one such block.

Definition 3.2 A string $\mathcal{S}$ has a period $u$ if $\mathcal{S}$ is a prefix of $u^k$
for some large enough $k$. The shortest period of a string $\mathcal{S}$ is called the
period of $\mathcal{S}$. Alternatively, a string $\mathcal{S}[1..m]$ has a period of length
$\pi$ if $\mathcal{S}[i] = \mathcal{S}[i+\pi]$, for $i = 1..m-\pi$.
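
The second form of the definition translates directly into code. The sketch below uses
0-indexed Python strings; the names `has_period` and `period_length` are illustrative
assumptions.

```python
def has_period(s: str, k: int) -> bool:
    """True if s[i] == s[i + k] for every valid i, i.e. s has a period of length k."""
    return all(s[i] == s[i + k] for i in range(len(s) - k))

def period_length(s: str) -> int:
    """Length of the shortest period of a non-empty string s."""
    return next(k for k in range(1, len(s) + 1) if has_period(s, k))
```

For instance, "abaab" has periods of lengths 3 and 5 but not 2, so its period length is 3.
Every string has its own length as a (trivial) period length.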

Lemma 3.3 (Lyndon and Schützenberger [13]) If a string of length $m$ has two periods of
lengths $p$ and $q$ and $p + q \le m$, then it also has a period of length $\gcd(p, q)$.

Lemma 3.4 Assume that the period length of a string $A[1..l]$ is $p$. If $A[1..l]$ occurs
only at positions $p_1 < p_2 < \cdots < p_k$ of a string $B$ and
$p_k - p_1 \le \lceil l/2 \rceil$, then the $p_i$'s form an arithmetic progression with
difference $p$.

Proof: Assume $k \ge 2$. We prove that $p = p_{i+1} - p_i$, for $i = 1, \ldots, k-1$. The
string $A[1..l]$ has periods of lengths $p$ and $q = p_{i+1} - p_i$. Since
$p \le q \le \lceil l/2 \rceil$, by Lemma 3.3 it also has a period of length $\gcd(p, q)$.
But $p$ is the length of the shortest period, so $p = \gcd(p, q)$ and $p$ must divide $q$.
The string $B[p_i..p_{i+1}+l-1]$ has a period of length $p$. If $q > p$, then there must be
another occurrence of $A$ at position $p_i + p$ of $B$; a contradiction. □
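
Lemma 3.4 can be checked exhaustively on short binary strings. The sketch below is only a
sanity check under the reading $p_k - p_1 \le \lceil l/2 \rceil$ of the hypothesis; the
function names are assumptions introduced for illustration, and positions are 1-indexed to
match the text.

```python
from itertools import product

def period_length(s: str) -> int:
    """Length of the shortest period of a non-empty string s."""
    return next(k for k in range(1, len(s) + 1) if s[k:] == s[:len(s) - k])

def occurrences(a: str, b: str) -> list[int]:
    """1-indexed start positions of all occurrences of a in b."""
    return [i + 1 for i in range(len(b) - len(a) + 1) if b[i:i + len(a)] == a]

def lemma_holds(a: str, b: str) -> bool:
    """Check the lemma for one pair; vacuously true if the hypothesis fails."""
    pos = occurrences(a, b)
    half = -(-len(a) // 2)  # ceil(l / 2)
    if len(pos) < 2 or pos[-1] - pos[0] > half:
        return True
    p = period_length(a)
    return all(y - x == p for x, y in zip(pos, pos[1:]))

def check_small_cases(max_a: int = 4, max_b: int = 8) -> bool:
    """Try every pair of strings over {a, b} up to the given lengths."""
    def strings(n):
        return ("".join(t) for t in product("ab", repeat=n))
    return all(lemma_holds(a, b)
               for la in range(1, max_a + 1) for a in strings(la)
               for lb in range(la, max_b + 1) for b in strings(lb))
```

For example, "abab" has period length 2 and occurs in "ababab" at positions 1 and 3, which
form an arithmetic progression with difference 2.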

Lemma 3.5 The sequence $\{q_i\}$, which is defined above, forms an arithmetic progression
with difference $\pi$, where $\pi$ is the period length of $\mathcal{P}[1..2^\eta]$.

Proof: The sequence $\{q_i\}$ lists the indices of all occurrences of
$\mathcal{P}[1..2^\eta]$ that start in a text block of length
$\lfloor 2^{\eta-1} \rfloor + 1$. By Lemma 3.4 the $q_i$'s form an arithmetic progression
with difference $\pi$, the period length of $\mathcal{P}[1..2^\eta]$. □

The sequence $\{q_i\}$ can be represented using three integers: the start, the difference,
and the length of the sequence. This representation can be easily computed from the output
of the string matching problem using Fich, Ragde and Wigderson's [10] minima algorithm in a
constant time and $2^\eta$ processors.

Let $\psi$ be the position where the period $\mathcal{P}[1..\pi]$ of
$\mathcal{P}[1..2^\eta]$ terminates in the pattern prefix $\mathcal{P}[1..2^{\eta+1}]$, and
$2^{\eta+1}+1$ if it does not terminate in this prefix. Let $\theta$ be the position where
the period of $\mathcal{P}[1..2^\eta]$ terminates in the text substring
$\mathcal{T}[q_r..q_r+2^{\eta+1}-1]$, and $q_r+2^{\eta+1}$ if it does not terminate in this
substring. By terminated periodicity we mean that
$\mathcal{P}[1..\psi-\pi-1] = \mathcal{P}[\pi+1..\psi-1]$ and
$\mathcal{P}[\psi-\pi] \ne \mathcal{P}[\psi]$, and that
$\mathcal{T}[q_i..\theta-\pi-1] = \mathcal{T}[q_i+\pi..\theta-1]$ and
$\mathcal{T}[\theta-\pi] \ne \mathcal{T}[\theta]$. The indices $\psi$ and $\theta$ can be
computed in a constant time on $2^\eta$ processors.

If the sequence $\{q_i\}$ has only a single element $q_1$, the algorithm can find the length
of the pattern prefix that starts at text position $q_1$ using the approach which is
described before. Otherwise, if the sequence has more than one element, the algorithm finds
the length of the pattern prefixes that start at text positions in $\{q_i\}$ as described
next in Lemma 3.6. The algorithm might still be required to use the approach that was
described before to find the length of the pattern prefix that starts at one of the
$\{q_i\}$ text positions.

Lemma 3.6 Let $\lambda = \min(\theta - q_i, \psi - 1)$. Then, the longest pattern prefix
that starts at text position $q_i$ is at least of length $\lambda$. Furthermore,

1. If $\theta - q_i \ne \psi - 1$, then the length of that prefix is exactly $\lambda$.

2. If $\theta - q_i = \psi - 1$, then that prefix can continue to any length and it is
necessary to compare more symbols to compute its length.

Note that at most one of the $q_i$'s can fall under this category.

Proof: Both the pattern prefix $\mathcal{P}[1..\psi-1]$ and the text substring
$\mathcal{T}[q_i..\theta-1]$ have period $\mathcal{P}[1..\pi]$, the period of the pattern
prefix $\mathcal{P}[1..2^\eta]$. Therefore, it is clear that
$\mathcal{P}[1..\lambda] = \mathcal{T}[q_i..q_i+\lambda-1]$.

1. If $\theta - q_i \ne \psi - 1$, then either

$\mathcal{P}[\lambda+1] = \mathcal{P}[\lambda-\pi+1]$ and
$\mathcal{T}[q_i+\lambda] \ne \mathcal{T}[q_i+\lambda-\pi]$

or

$\mathcal{P}[\lambda+1] \ne \mathcal{P}[\lambda-\pi+1]$ and
$\mathcal{T}[q_i+\lambda] = \mathcal{T}[q_i+\lambda-\pi]$.

Since $\lambda \ge 2^\eta \ge \pi$ and
$\mathcal{P}[\lambda-\pi+1] = \mathcal{T}[q_i+\lambda-\pi]$, in both cases
$\mathcal{P}[\lambda+1] \ne \mathcal{T}[q_i+\lambda]$, proving that the length of the
pattern prefix that starts at text position $q_i$ is exactly $\lambda$.

2. If $\theta - q_i = \psi - 1$, then it suffices to compare $\mathcal{P}[1..2^{\eta+1}]$ to
$\mathcal{T}[q_i..q_i+2^{\eta+1}-1]$ to find the length of the pattern prefix that starts
at text position $q_i$, or to conclude that the prefix is at least of length $2^{\eta+1}$
and therefore out of the range that has to be computed by this stage.

The extra comparisons are necessary since, if $\lambda < 2^{\eta+1}$, then
$\mathcal{P}[\lambda+1] \ne \mathcal{P}[\lambda-\pi+1]$ and
$\mathcal{T}[q_i+\lambda] \ne \mathcal{T}[q_i+\lambda-\pi]$, and it is possible that
$\mathcal{P}[\lambda+1] = \mathcal{T}[q_i+\lambda]$.

□

The computation in stage $\eta$ proceeds in each text block of length
$\lfloor 2^{\eta-1} \rfloor + 1$ simultaneously and can be summarized as follows:

1. Find all occurrences of the pattern prefix $\mathcal{P}[1..2^\eta]$ in the considered
text block and compute the $\{q_i\}$ sequence.

2. Compute the period length $\pi$ of the pattern prefix $\mathcal{P}[1..2^\eta]$.

3. Compute $\theta$ and $\psi$.

Figure 1: If $\theta - q_i < \psi - 1$, then all pattern prefixes that start at text
positions $\{q_i\}$ terminate at the same text position.

4. Find the length of the pattern prefix that starts at each text position $q_i$. By Lemma
3.6 the length is given by $\theta$ and $\psi$, except for at most one of the $q_i$'s that
has to be found separately.

If the length of the pattern prefix that starts at text position $q_i$ is out of the range
that has to be computed by this stage, do not update the output array entry $\Phi[q_i]$,
since it will be updated in another stage.
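
The four steps above can be traced with a sequential simulation. The following is a rough
sketch under stated assumptions, not the parallel algorithm itself: naive scans replace the
string matching, period and minima procedures, the blocks and stages are processed one after
another, and `None` models an out-of-bounds symbol whose comparisons always fail. All
function names are hypothetical.

```python
def period_length(s):
    """Length of the shortest period of a non-empty string."""
    return next(k for k in range(1, len(s) + 1) if s[k:] == s[:len(s) - k])

def sym(s, i):
    """1-indexed symbol access; None stands for an out-of-bounds symbol."""
    return s[i - 1] if 1 <= i <= len(s) else None

def differs(a, b):
    """A comparison fails if the symbols differ or either one is out of bounds."""
    return a is None or b is None or a != b

def direct_length(text, pattern, q, limit):
    """Compare up to `limit` symbols at text position q (the Theorem 2.2 approach)."""
    for j in range(1, limit + 1):
        if differs(sym(pattern, j), sym(text, q + j - 1)):
            return j - 1
    return limit

def run_stage(text, pattern, eta, phi):
    """Fill in the entries phi[q] with 2**eta <= phi[q] < 2**(eta + 1)."""
    size, n = 2 ** eta, len(text)
    if size > len(pattern):
        return
    prefix = pattern[:size]
    pi = period_length(prefix)                               # step 2
    psi = next((j for j in range(size + 1, 2 * size + 1)     # step 3 (pattern side)
                if differs(sym(pattern, j), sym(pattern, j - pi))), 2 * size + 1)
    block = size // 2 + 1                                    # floor(2**(eta - 1)) + 1
    for start in range(1, n + 1, block):
        qs = [q for q in range(start, min(start + block, n + 1))
              if text[q - 1:q - 1 + size] == prefix]         # step 1
        if not qs:
            continue
        q_r = qs[-1]
        theta = next((t for t in range(q_r + size, q_r + 2 * size)   # step 3 (text side)
                      if differs(sym(text, t), sym(text, t - pi))), q_r + 2 * size)
        for q in qs:                                         # step 4
            if len(qs) > 1 and theta - q != psi - 1:
                length = min(theta - q, psi - 1)             # Lemma 3.6, case 1
            else:
                length = direct_length(text, pattern, q, 2 * size)
            if length < 2 * size:                            # otherwise left to a later stage
                phi[q] = length

def prefix_match(text, pattern):
    """Return the output array for text positions 1..n as a Python list."""
    phi = [0] * (len(text) + 1)
    for eta in range(len(pattern).bit_length()):             # eta = 0..floor(log2 m)
        run_stage(text, pattern, eta, phi)
    return phi[1:]

def naive(text, pattern):
    """Reference answer obtained by direct comparison at every position."""
    return [direct_length(text, pattern, q, len(pattern)) for q in range(1, len(text) + 1)]
```

On the text "aaaaab" and the pattern "aaab", stage 1 finds the occurrences of "aa" at
positions 3 and 4 in one block. Position 4 falls under case 1 of Lemma 3.6 and gets length
2, while position 3 has $\theta - q_i = \psi - 1$ and needs the extra comparisons.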

Lemma 3.7 Stage number $\eta$ correctly computes all entries of the output array
$\Phi[1..n]$ that are in the range $2^\eta, \ldots, 2^{\eta+1}-1$. It takes
$O(\log\log 2^\eta)$ time and a total of $O(n)$ operations.

Proof: The calls to Breslauer and Galil's [7] string matching algorithm take
$O(\log\log 2^\eta)$ time and $O(n)$ operations.

The sequence $\{q_i\}$ can be represented by three integers which can be computed from the
output of the string matching algorithm (that is assumed to be a Boolean vector
representing all occurrences) in a constant time and $2^\eta$ operations in each block. The
rest of the work in each block also takes a constant time and $2^\eta$ operations.

There are $\frac{n}{\lfloor 2^{\eta-1} \rfloor + 1}$ blocks and thus, stage $\eta$ takes
$O(\log\log 2^\eta)$ time and $O(n)$ operations.

□

4  The  KMP  Failure  Function 

The Knuth-Morris-Pratt [12] string matching algorithm computes in its pattern preprocessing
step a table that is used later to guide the text processing step when comparisons fail.
This table is often called the KMP failure function.

Knuth, Morris and Pratt [12] actually define two functions: $\mathcal{F}[1..m]$ and
$next[1..m]$. Both functions can be used to guide the comparisons that fail, but the
$next[]$ function has more information and therefore it is more efficient. In this section
we show that using the string prefix-matching algorithm one can compute both functions
efficiently.


Both the $\mathcal{F}[]$ and the $next[]$ functions are strongly related to the periods of
the pattern prefixes and are actually a simple shift of the $\Pi[]$ and $\hat\Pi[]$
functions of the pattern that are defined next:

• Given a string $\mathcal{S}[1..m]$, the function $\Pi[1..m]$ is defined for
$\mathcal{S}[1..m]$ such that $\Pi[i]$ is the period length of the prefix
$\mathcal{S}[1..i]$.

• Given a string $\mathcal{S}[1..m]$, the table $\hat\Pi[1..m]$ is defined for
$\mathcal{S}[1..m]$ such that $\hat\Pi[i]$ is the length of the shortest terminated period
at position $i$ of $\mathcal{S}[1..m]$ if such a period exists.

That is, $\hat\Pi[i]$ is the length of the shortest period of $\mathcal{S}[1..i-1]$ that is
not a period of $\mathcal{S}[1..i]$. If all periods of $\mathcal{S}[1..i-1]$ are also
periods of $\mathcal{S}[1..i]$, then $\hat\Pi[i]$ is undefined.

Theorem 4.1 The function $\Pi[1..m]$ can be computed in $O(\log\log m)$ time on an
$\frac{m\log m}{\log\log m}$-processor CRCW-PRAM.

Proof: The algorithm will start by solving a string prefix-matching problem with the input
string $\mathcal{S}[1..m]$ given as both pattern and text. The output of the string
prefix-matching problem contains essentially all the information needed for the
$\Pi[1..m]$ function. Note that an integer $k$ is a period length of all prefixes
$\mathcal{S}[1..i]$ such that $k \le i \le k + \Phi[k+1]$. Therefore,

$\Pi[i] = \min\{k \mid 1 \le k \le i \le k + \Phi[k+1]\}$.

We show that $\Pi[1..m]$ can be computed on a CRCW-PRAM in $O(\log\log m)$ time on
$\frac{m}{\log\log m}$ processors if $\Phi[1..m]$ is given. The computation follows three
steps:

1. Compute a function $\mathcal{K}[1..m]$ such that

$\mathcal{K}[i] = \max_{k \le i}\{k + \Phi[k+1]\}$.

Using this notation, an integer $i$ is the period length of all prefixes
$\mathcal{S}[1..k]$ such that $\mathcal{K}[i-1] < k$ and $k \le \mathcal{K}[i]$.

2. Compute a function $\mathcal{B}[1..m]$ such that $\mathcal{B}[\mathcal{K}[i-1]+1] = i$
if $\mathcal{K}[i] > \mathcal{K}[i-1]$, and $\mathcal{B}[k] = 0$ otherwise.

3. Compute the $\Pi[1..m]$ function:

$\Pi[i] = \max_{k \le i}\{\mathcal{B}[k]\}$.

Note that both maxima computations can be done by Berkman, Schieber and Vishkin's [3]
prefix maxima algorithm. □
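
The three steps can be followed in a short sequential sketch. It assumes
$\mathcal{K}[0] = 0$ and $\Phi[m+1] = 0$ as boundary values, uses a naive quadratic
computation of $\Phi$ in place of the parallel prefix-matching algorithm, and uses
`itertools.accumulate` with `max` in place of the parallel prefix maxima algorithm. The
function names are illustrative assumptions.

```python
from itertools import accumulate

def prefix_match_self(s):
    """phi[i], i = 1..m+1: longest prefix of s starting at position i of s (phi[m+1] = 0)."""
    m = len(s)
    phi = [0] * (m + 2)
    for i in range(1, m + 1):
        length = 0
        while i - 1 + length < m and s[length] == s[i - 1 + length]:
            length += 1
        phi[i] = length
    return phi

def period_lengths(s):
    """Return the period lengths of the prefixes s[1..1], ..., s[1..m] as a list."""
    m = len(s)
    phi = prefix_match_self(s)
    # Step 1: K[i] = max over k <= i of k + phi[k + 1], with K[0] = 0 (assumed).
    K = [0] + list(accumulate((k + phi[k + 1] for k in range(1, m + 1)), max))
    # Step 2: mark the first prefix length for which i becomes the shortest period.
    B = [0] * (m + 2)
    for i in range(1, m + 1):
        if K[i] > K[i - 1]:
            B[K[i - 1] + 1] = i
    # Step 3: prefix maxima of B[1..m].
    return list(accumulate(B[1:m + 1], max))

def brute_period(s):
    """Reference: shortest period length by direct checking."""
    return next(k for k in range(1, len(s) + 1) if s[k:] == s[:len(s) - k])
```

For the string "abaab" the intermediate values are $\mathcal{K} = (1, 3, 5, 5, 5)$ and
$\mathcal{B} = (1, 2, 0, 3, 0)$, which gives the period lengths $(1, 2, 2, 3, 3)$.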

For the computation of the $\hat\Pi[1..m]$ function we use a more powerful CRCW-PRAM
model which is called the priority CRCW-PRAM. In this model each processor has a
preassigned priority and simultaneous writes of different values to a memory cell are
allowed. The actual value written is that of the processor with the highest priority.

Theorem 4.2 The function $\hat\Pi[1..m]$ can be computed in $O(\log\log m)$ time on an
$\frac{m\log m}{\log\log m}$-processor priority CRCW-PRAM.

Proof: The algorithm will start by solving a string prefix-matching problem with the input
string $\mathcal{S}[1..m]$ given as both pattern and text. The output of the string
prefix-matching problem contains essentially all the information needed for the
$\hat\Pi[1..m]$ function. Note that a period of length $k$ terminates at position
$k + \Phi[k+1] + 1$ of the input string $\mathcal{S}[1..m]$. Thus,

$\hat\Pi[i] = \min\{k \mid i = k + \Phi[k+1] + 1\}$.

The $\hat\Pi[1..m]$ array can be computed in a constant time on a priority CRCW-PRAM once
$\Phi[1..m]$ is given:

1. Initialize all entries of the $\hat\Pi[1..m]$ array to be undefined.

2. For each integer $k$, $1 \le k < m$, assign a processor with priority $k$ that attempts
to write the value $k$ into $\hat\Pi[k + \Phi[k+1] + 1]$.

If the write conflicts are resolved in such a way that the processor with the smallest
priority value succeeds in writing at each memory location, then the computation of
$\hat\Pi[1..m]$ is complete. □
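
The priority write can be imitated sequentially by performing the writes in order of
decreasing $k$, so that the smallest $k$ is the last to write to each cell. The sketch below
makes that assumption explicit; it also assumes that writes aimed at position $m+1$ (periods
that never terminate) are simply discarded, and it computes $\Phi$ naively. The function
names are hypothetical.

```python
def prefix_match_self(s):
    """phi[i], i = 1..m+1: longest prefix of s starting at position i of s (phi[m+1] = 0)."""
    m = len(s)
    phi = [0] * (m + 2)
    for i in range(1, m + 1):
        length = 0
        while i - 1 + length < m and s[length] == s[i - 1 + length]:
            length += 1
        phi[i] = length
    return phi

def terminated_periods(s):
    """Return the table for positions 1..m as a list; None marks an undefined entry."""
    m = len(s)
    phi = prefix_match_self(s)
    pi_hat = [None] * (m + 2)          # step 1; cell m + 1 absorbs out-of-range writes
    for k in range(m - 1, 0, -1):      # step 2; larger k first, so the smallest k wins
        pi_hat[k + phi[k + 1] + 1] = k
    return pi_hat[1:m + 1]

def has_period(s, k):
    """True if s has a period of length k."""
    return s[k:] == s[:len(s) - k]

def brute_terminated(s):
    """Reference: shortest period of s[1..i-1] that is not a period of s[1..i]."""
    return [next((k for k in range(1, i)
                  if has_period(s[:i - 1], k) and not has_period(s[:i], k)), None)
            for i in range(1, len(s) + 1)]
```

For the string "abaab" the result is undefined at positions 1 and 3, and equals 1, 2 and 4
at positions 2, 4 and 5 respectively.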